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Abstract. We describe a novel procedure for deciding when a mass-action model is incompatible 
with observed steady-state data that does not require any parameter estimation. Thus, we avoid 
the difficulties of nonlinear optimization typically associated with methods based on parameter 
fitting. Instead, we borrow ideas from algebraic geometry to construct a transformation of the model 
variables such that any set of steady states of the model under that transformation lies on a common 
plane, irrespective of the values of the model parameters. Model rejection can then be performed 
by assessing the degree to which the transformed data deviate from coplanarity. We demonstrate 
our method by applying it to models of multisite phosphorylation and cell death signaling. Our 
framework offers parameter-free perspective on the statistical model selection problem, which can 
complement conventional statistical methods in certain classes of problems where inference has to 
be based on steady-state data and the model structures allow for suitable algebraic relationships 
among the steady state solutions. 

keywords: chemical reaction networks, Grobner bases, mass-action kinetics, singular values, 
ordinary differential equations, algebraic statistics. 

many branches of science and engineering, one is often interested in the problem of model se- 
lection: given observed data and a set of candidate models for the process generating that data, 
which is the most appropriate model for that process? Such a situation commonly arises when the 
inner workings of a process are not completely understood, so that multiple models are consistent 
with the current state of knowledge. For mechanistic models, e.g., ordinary differential equation 
(ODE) or stochastic dynamical models, most selection techniques involve parameter estimation, 
which typically requires some form of optimization, exploration of the parameter space, or formal 
inference procedure [HIS]. For sufficiently complicated models, however, this task can become in- 
feasible, owing to the nonlinearity and multi-modality of the objective function (which penalizes 
any differences between the data and the model predictions), as well as the high dimensionality of 
the parameter space p]. 

Here, we present a framework for the discrimination of mass-action ODE models (and suitable 
generalizations thereof) that does not require or rely upon such estimated parameters. Our method 
(Fig. [1]) operates on steady-state data and combines techniques from algebraic geometry, linear 
algebra, and statistics to determine when a given model is incompatible with the data under all 
choices of the model parameters. The core idea is to use the model equations to construct a 
transformation of the original variables such that any set of steady states of the model under that 
transformation possesses a simple geometric structure, irrespective of parameter values. In this 
case, we insist that the transformed steady states lie on a plane, which we detect numerically; if 
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Figure 1. Parameter- free method for model discrimination (see text for details). 



the observed data are not coplanar under the transformation induced by a given model, then we 
can confidently reject that model. 

The idea of transformation to coplanarity is not new, but previous efforts were limited, in part, by 
its systematic detection and quantification. For example, in [3], it was necessary to first manually 
reduce the dimension of the transformed space to three so that coplanarity could be assessed 
visually. Other related research using similar methods include [5HZ]. The current work extends 
this by devising a numerical scheme for quantifying the deviation from coplanarity that generalizes 
to higher dimensions and allows for statistical interpretation. Thus, we provide a richer and more 
powerful framework for the application of this basic technique. Chemical reaction network theory 
(CRNT) [8l9j and stoichiometric network analysis [TO] likewise embrace a parameter-free philosophy 
and can also be exploited for model selection |llH13j . 

It is worth noting that our method provides a necessary but (generally) not sufficient condition 
for model compatibility: a model that is compatible with the data must provide a transformation to 
coplanarity, but a model that achieves coplanarity is not necessarily compatible, due to additional 
degrees of freedom introduced in the transformation process. This is in contrast to traditional 
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approaches based on parameter fitting, which provide a sufficient but not necessary condition since 
local extrema in the cost function surface may prevent a suitable fit. These two approaches are 
therefore complementary and can be used together for improved model selection. 

The remainder of this paper is organized as follows. First, we introduce the concept of steady- 
state invariants |3l[5], polynomials that vanish at steady state and which depend only on exper- 
imentally accessible variables. Then we illustrate how to use steady-state invariants to deduce 
coplanarity requirements for model compatibility and how to detect such coplanarity numerically; 
we also discuss invariants in the context of standard parameter fitting techniques. Next, we apply 
our method to models of multisite phosphorylation and cell death signaling. Finally, we end with 
some generalizations and concluding remarks. 

1. Steady-State Invariants 
Consider a chemical reaction network model 

N N 

(1) SijXj ^ s'ijXj, i = l,...,R 

in the species Xi, . . . , Xj\f, where Sij and s'^j are the stoichiometric coefficients of Xj in the reactant 
and product sets, respectively, of reaction i, with rate constant fej. Under mass-action kinetics, the 
model has dynamics 

R N 

(2) ^i = E%(4*-^^-^)n^fc''' i = h---,N, 

j=i k=i 

where Xj is the concentration of species Xi (throughout, we follow the convention that lowercase 
letters denote the concentrations of the corresponding species indicated in uppercase). These 
equations provide a quantitative description of the model and can, in principle, be used to test 
its validity by assessing the degree to which they are satisfied by observed data. Unfortunately, in 
practice, the required variables are rarely all available. In particular, the velocities x = (xi, . . . , i^r) 
can be difficult to measure, so we can often consider only the steady state x = 0, as we will do here. 
Furthermore, certain species may be experimentally inaccessible due to technological limitations; 
we eliminate these variables from the equations if possible. 

For simple models, this elimination can be done by hand, but a more systematic approach is 
required in general. One such approach is to use Grobner bases [H], a central tool in computational 
algebraic geometry that provides a generalization of Gaussian elimination for multivariate polyno- 
mial systems. Here, we follow the general procedure of Manrai and Gunawardena [4J. Let Q[a] be 
the polynomial ring consisting of all polynomials in the parameters a = (ki, . . . , kji) with coeffi- 
cients from the rational numbers Q, and let K be its fraction field, comprising all elements of the 
form f /g, where f,g G Q[a]. Clearly, each Xj € K[a;], the ring of all polynomials in a; = (xi, . . . , x^r) 
with coefficients in K. Note that the parameters a have been absorbed into the coefficient field K; 
thus, by performing all operations over K, we can treat a symbolically, i.e., without specifying any 
particular parameter values. 

To characterize the steady state a; = 0, we construct the ideal J = (x) generated by x, consisting 
of all polynomials YliLi fi^i, where each /j € K.[x]. Clearly, J contains all elements of ]K[x] that 
vanish at steady state. To obtain only those elements of J that do not depend on the variables 
xi,...,Xj, we consider the ith elimination ideal Jj = J n ]K[a;obs]; where a^obs = {^i+i, . . . ,xn) 
denotes the "observable" variables. Here, it is useful to introduce Grobner bases, which are special 
sets of generators with the so-called elimination property that ii g = {gi, . . . , qm) is a Grobner basis 
for J under the lexicographic ordering xi > • • • > xat, then J, = (fiTobs)) where g^^-^^ = g <^ ^[xohs] 
are precisely those elements of g containing only the variables Xobs* 

The polynomials g^y^^ generate 
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all elements of K[a;obs] that vanish at steady state and so characterize the projection of the steady 
state onto the variables a^obs- 

Procedurally, we compute a reduced Grobner basis g of J with respect to a suitable lexicograhic 
ordering using standard algorithms, then obtain g^^^ by subselection. For numerical convenience 
we further rescale each polynomial in g^^^ so that all coefficients belong to Q[a] (i.e., we multiply 
through by their common (lGnomina,tor) . Then the elements of ^obs 

= (/i, . . . , iNin^) have the form 

'^i -^obs 

(3) It (sobs ; a) = ^ (a) JJ xj:^" , i = 1 , . . . , N-.^^ , 

j=i k=i 

where we have applied the relabeling aiobs = (xi, ■ ■ ■ ,xn„^J- Clearly, each Ij is a polynomial in 
ajobs that vanishes at steady state; we call such polynomials steady-state invariants (or sometimes 
just invariants for short). 

Note in general that steady-state invariants may fail to exist since Jj may be empty. Moreover, 
invariants and their properties (e.g., degrees) can depend delicately on the choice of monomial 
ordering. Some manual intervention is therefore often required to obtain useful invariants. We 
will not treat this important (but subtle) issue here, instead focusing on the analysis of given 
invariants, however they are obtained. This also has the advantage of separating the computation 
of invariants from their interpretation, in principle allowing the use of invariants from various 
theories. Steady-state invariants, if they exist, describe relationships between observable variables 
that hold at steady state for any given realization of parameter values, regardless of other factors 
such as initial conditions. 

For full details on the computational procedure employed, see the accompanying Sage worksheet, 
which contains code for all computations performed (Materials and Methods). For further back- 
ground on algebraic geometry and Grobner bases (including the potential problems of obtaining 
them), see [TIJ; for other methods of variable elimination, see, e.g., [15j . Similar algebraic ideas 
have also appeared in the context of phylogenetics [HIIIT] . 

2. Model Discrimination 

We start with a set of steady-state measurements a;obs,i for i = 1, . . . ,m, and a given model with 
steady-state invariants = {/i, . . . , iNin^}- 

2.1. Data Coplanarity. An invariant, / S J^, can be written somewhat simplified as 

(4) I(^obs;a) = ^/,(a)n4'"- 

j=i k=i 

We first describe a procedure for deciding whether it is possible that the invariant is compatible 
with the data, i.e., 

(5) /(a;obs,i;a) = 0, i = l,...,m, 
for some choice of a. We therefore rewrite (jH) as 

n 

(6) I{y;b) = Y,bjyj, 

where yj = Yl^=i ^k'' = '"^i^h y = (yi,...,y„) and b = (6i, . . . , 6^). Let (p be 

the map taking a^obs to y. Then compatibility implies that the transformed variable y = ^^(^obs) 
corresponding to any observation x^hs, considered as a point in M" with coordinates {yi, . . . ,yn), 
lies on the hyperplane defined by the coefficients b. In other words, compatibility with the data 
a;obs,i implies that the corresponding transformed data y^ = c/?(^obs,i) are coplanar. 
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In general, it is possible that the invariant vanishes trivially, (6 = 0), under some choice of 
parameters for which coplanarity need no longer hold. To discount this case, we can check, for 
instance, that the denominator of the corresponding g^^^ is never zero. Then / always has at least 
one nonzero coefficient; hereafter in this section, we assume that the invariant is non-vanishing in 
this sense. 

Let Y G i^Tix" \)Q iiiQ matrix whose rows consist of the y^. Then the data are coplanar if and 
only if Yb = for some nontrivial column vector b ^ 0. Such a vector, by definition, resides in the 
null space of Y, which can be found using the singular value decomposition Y = U^V'^ , where 
the diagonal elements of XI give the singular values cij > encoding the "stretch" of each basis 
vector in V. In particular, the smallest siiigulcir value cTjjiij^ bounds the norm for any b ^ 

via 



(7) 



mm 

\\b\\=l 



\Yb\\ 



so if (Tmin > 0, then the data cannot be coplanar [18]. More generally, (Jmin gives the least squares 
deviation of the data from coplanarity under the scaling constraint = 1. This quantity depends 
only on the data and is therefore parameter-free. 

Note that this applies for any choice of 5, regardless of whether it can be realized by the original 
parameters a. In this sense, the condition of small (Tmin provides a necessary but not sufficient 
criterion for model compatibility. The additional degrees of freedom introduced by neglecting the 
functional forms fj effectively linearizes the compatibility condition ([5]), allowing for a simple, direct 
solution. 

To account for the presence of noise, suppose that we know each component Xk of a measurement 
iobs only up to an error Axk, with 



(8) 



obs ) 



where Z ~ A/'(0, 1) is a standard normal random variable. We imagine that the noise parameter e 
is given, for example, by instrument error. Then from the perturbation equation 



(9) 



y + Ay = if (xobs + AaJobs) , 



we find, expanding to first order, that the error is propagated to the transformed variables as 
^^obs) where V99 is the Jacobian of if, with elements (\/ip)ij — dyi/dxj. Therefore, 



(10) 



A^obs 



Ayj = ejZ, ej = e > (V99) Xfc. 



k=l 



We now consider the effect of the Ay^ on amm under the null hypothesis that the underlying y^ 
are coplanar with coefficients b (of unit norm). For this, we study the vector Yb, whose entries are 
perturbed from zero to 



(11) 



for each transformed datum y. Since ||b|| = 1 by assumption, if we rescale each row of Y by its 
corresponding effective error 



(12) 



eeff = max \ej\ > 

j=l,...,n 
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thus obtaining Y' , then each entry of Y'b has the form fiiZ with \ni\ < 1, for i = 1, . . . ,m. We 
hence define the coplanarity error 

(13) A = (j^in (1^0 < , 

which, from the discussion above, is bounded by the length of a normal random vector with variances 
isf < 1, whose distribution function clearly dominates that of the length of a normal random vector 
with variances fif = 1. But this latter quantity simply follows the x distribution with m degrees of 
freedom. In other words, 

(14) Pr(A>x) <Pr(X>x), X ^ Xm] 
if Pa is the upper a-percentile for Xm (e.g., a = 0.05), then 

(15) Pi{A>pa)<Pr{X>pa) = a, 

which gives an approximate criterion for rejecting coplanarity. As the amount of data increases, 
the approximation improves since o"inin(^') ll^'^'ll as m ^ oo by the symmetry of (jlOp . 

Depending on the exact situation at hand, it may be appropriate to choose a more conservative 
significance level a or to invoke additional criteria in order to decide whether a model is accept- 
able. In the examples below, however, we will see that whether a model can be rejected is often 
fairly obvious, and in such cases we will simply use the asymptotic arguments based on the Xm 
distribution. 

2.2. Invariant Minimization. Steady-state invariants can also be used in conjunction with stan- 
dard parameter fitting techniques. The basic approach is to minimize the Frobenius norm of the 
matrix 6 E ]R^invXm^ with entries dij = /i(a;obs,ji *^)) over the parameters a. This readily provides a 
sufficient condition for model compatibility since any a producing a small norm provides parameters 
that fit the data by construction. However, the condition is not necessary since suitable parameters 
may fail to be found even for compatible models due to the intricacies of the objective function. 
Clearly, prior knowledge of a can be used to guide the optimization away from such difficulties. 

Assuming that the model and its parameters are correct, each invariant I{xo\ys;a) = in prin- 
ciple. However, due to noise, I{xohs',CL) = ^{xohs] (1)2 , where 

n Afobs 

(16) e (xobs; a) = e ^ fj (a) ^ {Vip)jj^ Xk 

j=i k=i 

by ([TT]) . Therefore, if we use /(^obs! <^)/e(i^obs; <i) as the entry of 6 corresponding to invariant I 
and datum x^i^s, then the invariant error 

(17) e{a) = \\e{a)\\p^XN,,.^m- 

This can be used to compute the likelihood L{a) = Pr(0(a)) and allows, e.g., various likelihood- 
based selection schemes |19ll20j . assuming that the optimization can be performed. Here, we use 
the Akaike information criterion (AIC), 

(18) A = 2ii-21ogL^ax, 

where -Lmax = maxa-Z^(a), which penalizes model complexity; the preferred model is the one with 
the minimum AIC 1211. 



3. Results 



We apply our methods to two illustrative biological processes for which competing models exist: 
multisite phosphorylation and cell death signaling. 
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3.1. Multisite Phosphorylation. We focus first on phosphorylation, a key cellular regulatory 
mechanism that has been the subject of extensive study, both experimentally |22H24j and theoret- 
ically [^[5t l25H27] ■ Following [4j, we consider a two-site system with reactions, 

(19a) K + Su^KSu^K + S^, 

bu 

(19b) F + S.^FS.^F + Su, 

where u, f € {0, 1}^ are bit strings of length two, encoding the occupancies of each site (0 or 1 for 
the absence or presence, respectively, of a phosphate), with u having less bits than v; Su is the 
phosphoform with phosphorylation state u; K is a kinase, an enzyme that adds phosphates; and F 
is a phosphatase, an enzyme that removes phosphates. Each enzyme can be either processive (P), 
where more than one phosphate modification may be achieved in a single step, or distributive (D), 
where only one modification is allowed before the enzyme dissociates from the substrate (cooii = 
for K, 71100 for F). This mechanistic diversity generates four competing models: PP, PD, DP, 
and DD; where the first letter designates the mechanism of the kinase, and the second, that of the 
phosphatase. 

As in [3], we consider only the concentrations a^obs = (•SOO) •soi, sio, sii) as observable and use the 
ordering, 

(20) {ksQo, ksQi, ksio, fsoi, fsio, fsii,k, f, soo, soi, sio, sii), 

with which we are able to eliminate all other variables except / from the dynamics of each model. 
The remaining Grobner basis polynomials are of the form p(/, aJobs) = / ■ ^(iCobs)) where / 7^ 
unless there is no phosphatase in the system, which we assume not to be the case, so we take only 
the observable part (/(a^obs)- It is easy to check that the resulting denominators are always of one 
sign. 

Each model has three steady-state invariants. Matched appropriately, the invariants for model 
PP share the same transformed variables y = (^(a^obs) as those for PD; the same is true for DP and 
DD. Thus, in terms of the transformed data, only the kinase mechanism is discriminative. Between 
PP/PD and DP/DD, two invariants (/i and I2) are discriminative in principle, though only one 
{I2) succeeds numerically: for simulated data from the PP/PD models, provided that the noise 
level is sufficiently low, lack of coplanarity on I2 is able to correctly reject the DP/DD models at 
significance level a = 0.05 (A ~ 10^ versus A ~ 1 for PP/PD at e = 10~^, against a threshold 
of Pa = 11.2). The corresponding test using DP/DD data is not successful due to the form of I2, 
which has transformed variables, 

(21a) yPP/PD ^ (soosio, soosii, soisio, soisn, Sio, siosii) , 

(21b) yDP/DD = (SOOSII, SOlSlO, SOlSll, SlO' •SloSll) 

for PP/PD and DP/DD, respectively, i.e., yPP/PP" has the additional variable soo^io over yP*P/DD. 
Therefore, PP/PD models can be made to fit DP/DD data simply by setting the coefficient corre- 
sponding to soO'Sio to zero, which is in fact what we observe. No model is rejected on the basis of 
data generated from it. 

We emphasize that these results are specific to the particular ordering chosen. Indeed, one can 
make the phosphatase mechanism discriminative instead by reversing the order of the variables 
Xohs in (|20p . The exhaustive analysis of such orderings is beyond the scope here; rather, we aim to 
illustrate the potential uses (and usefulness) of this type of approach using concrete examples. 

Although the condition of coplanarity is technically valid only at steady state, there should never- 
theless be some convergence over time to coplanarity for any compatible model. We hence compute 
A for the PP/PD and DP/DD models along time course trajectories simulated from model PP at 
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various levels of e (Fig. [2]A.). For low noise, the results confirm convergence for invariants previously 
identified as compatible (all Ij for PP/PD; Ii and for DP/DD), with stagnation for incompatible 
invariants {I2 for DP/DD); this does suggest wider applicability of this method, provided that the 
data are approaching steady state reasonably fast. As the noise increases, however, A decreases 
inversely proportionally, until the stagnation point hits the basal error level of A ~ 1 and we lose 
all power to reject. Additional simulations estimate the critical noise level at e ~ 10~^ (Fig. EJB). 

To further discriminate between all four models we next turn to invariant minimization. The 
required optimization involves highly nonlinear functions, so success should be expected only if we 
have good initial estimates of the model parameters. This is a rather strong demand. In such a 
case, however, minimization is indeed capable of identifying the correct model from the data so 
long as e < 10^^ (Fig. EJC). These results reinforce our belief that the algebraic approach proposed 
here naturally complements conventional (i.e. parametric) reverse engineering schemes such as 
optimization or inference procedures. 

3.2. Cell Death Signaling. We next apply our methods to receptor-mediated cell death signaling, 
the so-called extrinsic apoptosis pathway, which plays a prominent role in cancers and other diseases 
[28l - [3T] . Specifically, we consider the assembly of the death- inducing signaling complex (DISC), a 
multi-protein oligomer formed by the association of FasL, a death ligand, with its cognate receptor 
Fas [321133]. 

We investigate two models of DISC formation. The first [M], which we call the crosslinking 
model, is based on the successive binding of Fas (R) to FasL (L), 

3k f 

(22a) L + R^Ci, 

(22b) Ci + R^C2, 

(22c) C2 + R^C3, 

where is the complex FasL:Fasj. The second [6], which we call the cluster model, posits three 
forms of Fas (inactive, X; active and unstable, Y; active and stable, Z) and specifies receptor 
cluster-stabilization events driven by FasL, 

(23a) X^Y, 
(23b) Z ^Y, 

(23c) jY + {i-j)Z^ -k)Y + {i-j + k)Z, 

(23d) L + jY + {i - j) Z ^ L + {j - k)Y + {i - j + k) Z, 

where the last two reactions represent entire families generated by taking i = 2 or 3, with j = 1, . . . ,i 
and k = 1, . . . ,j. The cluster model is capable of bist ability, whereas the crosslinking model exhibits 
only monostable behavior [6]. 

The two models are structurally very different, and discriminating between them requires some 
care. Hence, following [6], we establish a correspondence between the models by considering the 
apoptotic signal ( transduced by the DISC, defined as C = ci + 2c2 + 3c3 for the crosslinking 
model and C = z for the cluster model. We assume that (" is experimentally accessible; other 
variables assumed accessible include A, the total concentration of FasL (A = / + ci + C2 + C3 and 
A = / for the crosslinking and cluster models, respectively), and p, the total concentration of Fas 
(p = r+ci+2c2+3c3 and p = x+y+z, respectively). Eliminating all other variables via the orderings 
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Figure 2. Discrimination of multisite phosphorylation models. (A) Coplanarity 
error A of the steady-state invariants of the PP/PD (left) and DP/DD (right) models 
along time course trajectories simulated from the PP model, corrupted by various 
levels of noise (lined, e = 10~^; dashed, e = 10~^; dotted, e = 10^'^). At each noise 
level, the errors for three invariants are shown (blue, Ii; green, I2; red, (B) 
Coplanarity error A of DP /DD invariants on PP data at steady state as a function 
of the noise level e; invariants colored as in (A). The shaded region indicates the 
regime over which the DP/DD models can be rejected at significance level a = 0.05. 
((7) Invariant error AIC A for each model (blue, PP; green, PD; red, DP; cyan, DD) 
on data generated from the PP (top left), PD (top right), DP (bottom left), and 
DD (bottom right) models. 

(c2, C3, A, p, C) and (y. A,/?, C) for the crosslinking and cluster models (after appropriate variable 
substitutions) we obtain one non-vanishing steady-state invariant for each model. The dimensions 
of the transformed spaces are 5 and 15 for the crosslinking and cluster models, respectively. 

As for phosphorylation, we compute the coplanarity error for each invariant on time course data 
simulated from each model at various noise levels. Although results are inconclusive for data from 
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Figure 3. Discrimination of cell death signaling models. {A) Coplanarity error A of 
the steady-state invariants of the crosslinking (left) and cluster (right) models along 
time course trajectories simulated from the cluster model, corrupted by various levels 
of noise (blue, e = 10~^; green, e = 10-6; red, e = lO'^). (5) Coplanarity error A of 
model invariants (blue, crosslinking; green, cluster) on cluster data at steady state 
as a function of the noise level e. The shaded region indicates the regime over "which 
the crosslinking model can be rejected at significance level a = 0.05. (C) Invariant 
error AIC A for each model (blue, crosslinking; green, cluster) on data generated 
from the crosslinking (left) and cluster (right) models. 



the crosslinking model, the coplanarity criterion can reject the crosslinking model on the basis of 
cluster model data at a = 0.05, provided that e < 10"^ (Fig.[3]A and B). The minimization protocol 
also correctly identifies the model from the data over the same range of noise levels (Fig. [3lG). 



4. Discussion 

In this paper, we have presented a novel model discrimination scheme based on steady-state 
coplanarity that does not require kno'wn or estimated parameter values. Thus we are able to 
sidestep the parameter inference problem common to many fields including systems biology [3tl35j . 
Such algebraic methods are not always effective, ho"wever: steady-state invariants may not exist, 
and even when they do, the additional degrees of freedom introduced by effective linearization 
can cause the method to fail. A promising solution to the problem when invariants cannot be 
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calculated using Grobner bases may be to employ invariants from CRNT |36j . Our results also 
suggest a somewhat low tolerance for noise, which can restrict its applicability. Significantly, our 
method has the unique feature that it can be applied with complete ignorance of parameter values, 
and is therefore a useful additional tool in the analysis of inverse problems involving dynamical 
systems. 

Rather than competing directly with current model discrimination techniques, we expect that 
coplanarity will form one end of an entire spectrum of methods, to be used when no parameter 
information is available. At the other end lie methods based on parameter estimation (including 
invariant minimization), which, for dynamical systems, can depend delicately on qualitative and 
quantitive aspects of the systems under consideration [371|38]. The intermediate regime comprises 
techniques that can leverage partial knowledge, for instance, constraints on certain parameter values 
or qualitative features of the dynamics [39]. Along this spectrum, naturally, the discriminative 
power increases with the amount of prior information available. In this broader context, coplanarity 
can be used to efficiently reject candidate models before employing more demanding parameter 
estimation tools. Thus, it can serve as a preprocessor to thin out the model space. The real 
advantages and limitations of any inferential procedure become apparent once their performance 
can be evaluated in real-world applications. This is perhaps particularly true for this current 
approach. Certainly a range of theoretical and computational issues surround algebraic methods 
which will likely impact their applicability. Here we have found that a pragmatic approach yields 
some useful insights for small and intermediate-sized problems. 

Finally, we remark that the presented scheme is but the simplest of a potential new class of 
parameter-free selection methods based on the detection of geometric structure. In this view, 
transformation to coplanarity is just one of many low-dimensional descriptions of such structure. 
The existence of low-dimensional representations has recently been predicted in neuronal signaling 
[30] , and can ultimately be attributed to the inherent robustness of biological systems [1T1I32] . 



5. Materials 



5.1. Grobner Basis Calculation. All reduced Grobner bases are computed over the field K of ra- 
tional functions in the parameters a with rational coefficients, under a suitable lexicographic order- 
ing with the observables cCobs located at the end of the variable list, using the computer algebra sys- 



tem Singular (http://www.singular.uni-kl.de/ ) as interfaced through Sage (http:/ /www. sagemath.org/). 



5.2. Data Generation. For each model parameters are drawn independently from a log-normal 



distribution with median fi* = = 1 and multiplicative standard deviation a* 



2, where 



fi and a are the mean and standard deviation, respectively, of the underlying normal distribution. 
Using these parameters m = 100 time course trajectories are computed for each model via integra- 
tion of the model ODEs over the time interval < t < 100; each trajectory is seeded by random 
initial conditions sampled from a log-normal distribution also with //* = 1 and a* = 2. Integration 
is performed using the solver LSODA as wrapped in SciPy (http:/ /www.scipy.org/). The data are 
then corrupted by noise of varying levels from e = 10^^ to 10 
data by random log-normal samples with ^* = 1 and 



for each e, multiplying the nominal 
1 + e. 



5.3. Invariant Minimization. Invariant error likelihood maximization is performed in two phases. 
First, an approximate optimal parameter set is obtained by minimizing the Frobenius norm of the 
matrix 77 G ]^MnvXm^ where each entry corresponds to an invariant-datum pair as in 6, but with 
values I{Xohs'i'^)/M{xohs]o,), where 



(24) 



M{xohs;a) 



Nobs 



/. («) n 



k=l 
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This is then taken as an initial parameter estimate to compute -Lmax- All optimizations are per- 
formed using L-BFGS-B [33] through SciPy, with lower and upper bounds of 0.01 and 100, respec- 
tively, for each variable. The minimization of is seeded with initial value 1 for all variables. 

5.4. Computational Platform. All computations are performed centrally in Sage, making use of 

its interfaces to various programs. Plots were produced using matplotlib (http: //matplotlib. sourceforge.net/ ). 

The Sage worksheet for this paper, which contains code for all computations performed, is available 

at http:/ /www. sagenb.org/home/pub/3462/. 
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